Automatic Extraction of Aspectual Information from a Monolingual Corpus
Authors
Abstract
This paper describes an approach to extracting aspectual information about Japanese verb phrases from a monolingual corpus. We classify verbs into six categories by means of aspectual features, which are defined on the basis of whether a verb can co-occur with particular aspectual forms and adverbs. A unique category could be identified for 96% of the target verbs. To evaluate the result of the experiment, we examined the meaning of -teiru, one of the most fundamental aspectual markers in Japanese, and obtained a correct recognition rate of 71% on 200 sentences.
Similar resources
Automatic Induction of German Aspectual Verb Classes in a Distributional Framework
The central question of this study is whether aspectual verb classes (Vendler, 1967) can be induced from corpus data in a fully automatic, distributionally motivated procedure. We propose an operationalization of ‘aspectivity’ utilizing distributional information about nominal fillers in the argument positions of verbs in combination with aspectual features automatically derived from dependency...
Automatic Identification of Aspectual Classes across Verbal Readings
The automatic prediction of aspectual classes is very challenging for verbs whose aspectual value varies across readings, which are the rule rather than the exception. This paper sheds a new perspective on this problem by using a machine learning approach and a rich morpho-syntactic and semantic valency lexicon. In contrast to previous work, where the aspectual value of corpus clauses is determ...
Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation
Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...
Automatic Discovery of Similar Words
We deal with the issue of automatic discovery of similar words (synonyms and near-synonyms) from different kinds of sources: from large corpora of documents, from the Web, and from monolingual dictionaries. We present in detail three algorithms that extract similar words from a large corpus of documents and consider the specific case of the World Wide Web. We then describe a recent method of aut...
Synonymous Collocation Extraction Using Translation Information
Automatically acquiring synonymous collocation pairs from corpora is a challenging task. For this task, we can, in general, have a large monolingual corpus and/or a very limited bilingual corpus. Methods that use monolingual corpora alone or use bilingual corpora alone are apparently inadequate because of low precision or low coverage. I...
Publication date: 1997